Mathematical Properties and Analysis of Google's PageRank
Authors
Abstract
To determine the order in which to display web pages, the search engine Google computes the PageRank vector, whose entries are the PageRanks of the web pages. The PageRank vector is the stationary distribution of a stochastic matrix, the Google matrix. The Google matrix in turn is a convex combination of two stochastic matrices: one matrix represents the link structure of the web graph, and a second, rank-one matrix mimics the random behaviour of web surfers and can also be used to combat web spamming. As a consequence, PageRank depends mainly on the link structure of the web graph, but not on the contents of the web pages. We analyze the sensitivity of PageRank to changes in the Google matrix, including addition and deletion of links in the web graph. Due to the proliferation of web pages, the dimension of the Google matrix most likely exceeds ten billion. One of the simplest and most storage-efficient methods for computing PageRank is the power method. We present error bounds for the iterates of the power method and for their residuals.
Keywords: Markov matrix, stochastic matrix, stationary distribution, power method, perturbation bounds
AMS subject classification: 15A51, 65C40, 65F15, 65F50, 65F10
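The construction described in the abstract can be illustrated with a small sketch. It is not the paper's own code: the names `google_matrix` and `pagerank_power`, the damping value `alpha = 0.85`, the uniform vector `v`, the stopping tolerance, and the three-page link matrix `S` are all assumptions chosen for illustration. The sketch forms the Google matrix as the convex combination G = alpha*S + (1 - alpha)*1*v^T and runs the power method, using the 1-norm of the difference between successive iterates as the stopping criterion.

```python
import numpy as np


def google_matrix(S, alpha, v):
    """Dense Google matrix G = alpha*S + (1 - alpha) * 1 v^T (small examples only)."""
    n = S.shape[0]
    return alpha * S + (1.0 - alpha) * np.outer(np.ones(n), v)


def pagerank_power(S, alpha=0.85, v=None, tol=1e-10, max_iter=1000):
    """Power method for the stationary distribution pi^T = pi^T G.

    S     : row-stochastic link matrix (assumed to have no dangling rows)
    alpha : weight of the link matrix (0.85 is an assumed value)
    v     : probability vector defining the rank-one part (uniform if None)
    """
    n = S.shape[0]
    if v is None:
        v = np.full(n, 1.0 / n)
    x = v.copy()
    for k in range(1, max_iter + 1):
        # Because x sums to 1, x^T (1 v^T) = v^T, so G is never formed explicitly.
        x_new = alpha * (S.T @ x) + (1.0 - alpha) * v
        step = np.abs(x_new - x).sum()  # 1-norm change between iterates
        x = x_new
        if step < tol:
            break
    return x, k, step


# Hypothetical web graph: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
S = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
alpha = 0.85
v = np.full(3, 1.0 / 3.0)

G = google_matrix(S, alpha, v)
pi, iters, step = pagerank_power(S, alpha=alpha, v=v)
print("PageRank vector:", pi)
print("iterations:", iters)
```

Applying the rank-one term as a vector addition rather than storing G reflects the storage-efficiency point made in the abstract; only the sparse link matrix and a few vectors would be needed for a large web graph.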
Similar resources
PageRank of integers
We build up a directed network tracing links from a given integer to its divisors and analyze the properties of the Google matrix of this network. The PageRank vector of this matrix is computed numerically and it is shown that its probability is inversely proportional to the PageRank index thus being similar to the Zipf law and the dependence established for the World Wide Web. The spectrum of ...
Spectral properties of the Google matrix of the World Wide Web and other directed networks
We study numerically the spectrum and eigenstate properties of the Google matrix of various examples of directed networks such as vocabulary networks of dictionaries and university World Wide Web networks. The spectra have gapless structure in the vicinity of the maximal eigenvalue for Google damping parameter α equal to unity. The vocabulary networks have relatively homogeneous spectral densit...
Google matrix and Ulam networks of intermittency maps
We study the properties of the Google matrix of an Ulam network generated by intermittency maps. This network is created by the Ulam method, which gives a matrix approximant for the Perron-Frobenius operator of a dynamical map. The spectral properties of eigenvalues and eigenvectors of this matrix are analyzed. We show that the PageRank of the system is characterized by a power law decay with the ...
PageRank algorithm and Monte Carlo methods in PageRank Computation
PageRank is the algorithm used by the Google search engine for ranking web pages. PageRank Algorithm calculates for each page a relative importance score which can be interpreted as the frequency of how often a page is visited by a surfer. The purpose of this work is to provide a mathematical analysis of the PageRank Algorithm. We analyze the random surfer model and the linear algebra behind it...
Time evolution of Wikipedia network ranking
We study the time evolution of ranking and spectral properties of the Google matrix of the English Wikipedia hyperlink network during the years 2003-2011. The statistical properties of the ranking of Wikipedia articles via PageRank and CheiRank probabilities, as well as the matrix spectrum, are shown to be stabilized for 2007-2011. Special emphasis is placed on the ranking of Wikipedia personalities ...